Solution: Tailored to Fit
Learn how you should use keys in the database.
We'll cover the following
NA primary key is a constraint, not a data type. We can declare a primary key on any column or set of columns, as long as the data types support indexing. We are also able to define a column as an auto-incrementing integer without making it the primary key of the table. The two concepts are independent of each other.
In short, don’t let inflexible conventions get in the way of good design.
Tell it like it is#
We should choose sensible names for our primary key. The name should convey the type of entity that the primary key identifies. For example, the primary key of the Bugs
table should be bug_id
.
A quick way to check if a value exists in one table while inserting an entry in a second table is by using a foreign key. We must use the same column name in foreign keys where possible. This often means that the name of a primary key should be unique within our schema; no two tables should have the same name for their primary key unless one is also a foreign key referencing the other. However, there are exceptions: sometimes it is appropriate for a foreign key to be named differently from the primary key it references so as to be descriptive of the nature of the association.
An industry-standard exists to describe naming conventions for metadata. The standard, called ISO/IEC 11179, is a guideline for “managing classification schemes” in information technology systems. In other words, this is how we should name our tables and columns sensibly. Like most ISO standards, this document is nearly impenetrable, but Joe Celko applies it practically to SQL in his book SQL Programming Style.
Be unconventional#
Object-relational frameworks expect us to use a pseudo key named id
, but they also allow us to override this and declare a different name instead. The following example uses Ruby on Rails:
Some developers think that specifying the primary key column is necessary only when supporting legacy databases, in which they can’t use their preferred conventions. However, keeping sensible column names is also vital in new projects.
Embrace natural keys and compound keys#
If our table contains an attribute that’s guaranteed to be unique, is non-null, and can serve to identify the row, we don’t need to feel obligated to add a pseudo key solely for the sake of tradition.
Practically speaking, it’s not uncommon for every attribute in a table to be subject to change or non-unique. Databases tend to evolve during the lifetime of a project, and decision-makers may not respect the sanctity of a natural key. Sometimes a column that seemed like a good natural key turns out to have legitimate duplicates. In cases like these, a pseudo key is the only solution.
It is best to use compound keys when they’re appropriate. When a row is best identified by combining multiple attribute columns, as in the BugsProducts
table, we should use those columns in a compound primary key.
Let’s run the query in the playground below.
Now, let’s try to add a duplicate entry INSERT INTO BugsProducts (bug_id, product_id) VALUES (1234, 1);
and see what we get.
Of course, there should be no duplicates.
Note that foreign keys that reference a compound primary key also need to be compound. It may seem clumsy to duplicate these columns in the dependent tables, but they can also have advantages: we may simplify a query that would have required a join to fetch attributes of the referenced row.